Spam filtering by quantitative profiles
Authors
Abstract
In the quantitative profile approach to spam filtering and email categorization, an email is represented not by a “bag of words” but by an m-dimensional vector of numbers, with m fixed in advance. Inspired by the email shape analysis recently proposed by Sroufe et al., two instances of quantitative profiles are considered: the line profile and the character profile. The performance of these profiles is studied on the TREC 2007 and CEAS 2008 corpora and on a private corpus. At low computational cost, the two quantitative profiles achieve performance that is at least comparable to that of heuristic rules and naive Bayes.
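The abstract does not define the two profiles, so the following is only a minimal sketch of what a fixed-length numeric representation might look like. It assumes, hypothetically, that a line profile records the lengths of the first m lines and that a character profile counts a fixed set of character classes. The function names `line_profile` and `character_profile`, the default m = 8, and the five character classes are illustrative choices, not the authors' definitions.

```python
from typing import List


def line_profile(text: str, m: int = 8) -> List[float]:
    """Hypothetical line profile: lengths of the first m lines.

    Shorter emails are zero-padded so the vector always has m entries.
    """
    lengths = [float(len(line)) for line in text.splitlines()[:m]]
    return lengths + [0.0] * (m - len(lengths))


def character_profile(text: str) -> List[float]:
    """Hypothetical character profile: counts of five character classes.

    The choice of classes is an assumption made for illustration; here m = 5.
    """
    return [
        float(sum(c.isalpha() for c in text)),   # letters
        float(sum(c.isdigit() for c in text)),   # digits
        float(sum(c.isspace() for c in text)),   # whitespace
        float(sum(c.isupper() for c in text)),   # uppercase letters
        float(sum(not c.isalnum() and not c.isspace() for c in text)),  # other symbols
    ]


if __name__ == "__main__":
    email_body = "Hello Bob,\nWIN $1000 now!!!\n"
    print(line_profile(email_body, m=4))
    print(character_profile(email_body))
```

Under these assumptions every email maps to a vector of the same length regardless of its size, so the vectors can be passed to any standard classifier without building a vocabulary.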
Similar resources
Autonomous Personal Filtering Improves Global Spam Filter Performance
Using two email streams, we show that a personal filter trained exclusively on user feedback substantially outperforms (p ≈ 0.000) three industry-leading global spam filters not using feedback. We show that autonomous personal filters, trained on the output from a global spam filter rather than user feedback, substantially outperform (p ≈ 0.000) the global filter, if by a somewhat smaller facto...
A suitable method for classifying advertising emails based on user profiles
In general, whether a message is spam depends on whether it satisfies the recipient, not on the content of the email. Under this definition, problems arise in marketing and advertising: for example, the same advertising email may be spam for some users and not for others. To deal with this problem, many researchers design an anti-s...
Artificial Immune System for Collaborative Spam Filtering
Artificial immune systems (AIS) use the concepts and algorithms inspired by the theory of how the human immune system works. This document presents the design and initial evaluation of a new artificial immune system for collaborative spam filtering. Collaborative spam filtering allows for the detection of not-previously-seen spam content, by exploiting its bulkiness. Our system uses two novel a...
Study of Static Classification of Social Spam Profiles in MySpace
Reaching hundreds of millions of users, major social networks have become important target media for spammers. Although practical techniques such as collaborative filters and behavioral analysis are able to reduce spam, they have an inherent lag (to collect sufficient data on the spammer) that also limits their effectiveness. Through an experimental study of over 1.9 million MySpace profiles, w...
Spam filtering using the Kolmogorov complexity analysis
One of the most irrelevant side effects of e-commerce technology is the development of spamming as an e-marketing technique. Spam e-mails (or unsolicited commercial e-mails) induce a burden for everybody having an electronic mailbox: detecting and filtering spam is then a challenging task and a lot of approaches have been developed to identify spam before it is posted in the end user’s mailbox....
Journal: CoRR
Volume: abs/1201.0040
Publication date: 2011